Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Userspace runtime loader support for sub-library c18n #2267

Open
wants to merge 16 commits into
base: dev
Choose a base branch
from

Conversation

bsdjhb
Copy link
Collaborator

@bsdjhb bsdjhb commented Dec 11, 2024

  • rtld: Remove code to track start and end offsets of text and rodata
  • rtld: Support multiple PT_GNU_RELRO program headers
  • rtld: Support multiple PLTs
  • elf: Add definition of PT_CHERI_BOUNDS program header type
  • elftoolchain: Add support for PT_CHERI_BOUNDS
  • rtld: Support narrower PCC bounds via PT_CHERI_BOUNDS
  • rtld: Require PT_CHERI_BOUNDS to be exact
  • elf: Add definitions for PT_COMPARTMENT and DT_C18NSTRTAB*
  • elftoolchain: Add support for PT_COMPARTMENT and DT_C18NSTRTAB*
  • rtld: Add compartments for sub-object compartments described by PT_COMPARTMENT
  • rtld_c18n: Fix copy/paste typo in _rtld_safebox_code error messages
  • rtld: Use compartment IDs from sub-object compartments for policy enforcement

@bsdjhb
Copy link
Collaborator Author

bsdjhb commented Dec 11, 2024

These are the userspace rtld changes to support the psABI extensions to date for sub-library c18n: specifically multiple PT_GNU_RELRO (already merged upstream), multiple PLTs per object, PT_CHERI_BOUNDS, and PT_COMPARTMENT.

addr = (Elf_Addr)(uintptr_t)obj->relocbase + offset;
for (unsigned long i = 0; i < obj->npcc_caps; i++) {
const char *pcc_cap = obj->pcc_caps[i];
if (addr >= (ptraddr_t)pcc_cap && addr < cheri_gettop(pcc_cap))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This won’t work for one-past-the-end

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don’t know if it’s enough to fall back on the PCC cap with that top if no better one is found or whether picking up bounds from the next one if it exists could be a security concern. If the latter then we’d have to keep them non-contiguous and make this inclusive.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, my initial thought is that for relocations with an addend, we should use the address of the symbol to find a compartment or PCC cap, but things like R_MORELLO_RELATIVE may not have a symbol.

For code pointers, do we think we will likely have a code pointer that is one-past-the-end? I could see why a data pointer might do that? One question I also have about both APIs I have added here are if the arguments should be an address and length or just an address?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a check that PCC caps do not overlap, including the top (so limit + 1) which I think means that an off by one will now just fail to resolve to anything. I can't really imagine that there are valid code capabilities that are one past the end.

libexec/rtld-elf/riscv/reloc.c Outdated Show resolved Hide resolved
libexec/rtld-elf/cheri/cheri_reloc.c Outdated Show resolved Hide resolved
libexec/rtld-elf/aarch64/reloc.c Outdated Show resolved Hide resolved
libexec/rtld-elf/rtld.c Show resolved Hide resolved
@@ -533,6 +533,7 @@ typedef struct {
#define PT_PHDR 6 /* Location of program header itself. */
#define PT_TLS 7 /* Thread local storage segment */
#define PT_LOOS 0x60000000 /* First OS-specific. */
#define PT_COMPARTMENT 0x64331380 /* Sub-object compartment. */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this should be some PT_C18N_FOO to group things in case there are other C18N headers in future? Especially given the ASCII "C18" in the numeric value.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what else you would call it though. PT_C18N_COMPARTMENT seems redundant.

libexec/rtld-elf/rtld.c Outdated Show resolved Hide resolved
libexec/rtld-elf/rtld_c18n.c Outdated Show resolved Hide resolved
size_t len;

assert(obj->compart_id == 0);
obj->compart_id = compart_id_allocate(name);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bit confusing that both Obj_Entry and Compart_Entry have a compart_id field, and is just asking for someone to accidentally use the Obj_Entry one

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't necessarily disagree. Having a Compart_Entry for the default compartment doesn't quite work though either (its bounds would be the object's entire address range). In theory, nothing should really be using obj->compart_id directly but using the wrapper function to lookup based on address. I could maybe obfuscate the field name to obj->default_compart_id?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did rename the field in Obj_Entry which exposed a place I had missed. :)

This was used in CHERI-MIPS to enforce stricter PCC bounds for some
ABIs but is no longer used.
Copy link
Contributor

@dpgao dpgao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding 51db786---Good catch! The _rtld_safebox_code and _rtld_safebox_code functions are meant to be a temporary way for user code to wrap function pointers in trampolines. These functions will be removed soon so it is probably not worth fixing the error message any more.

Elf32_Word bkt, nmaskwords;
unsigned long i, jmprel, pltrelsz, pltgot;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are the latter three numbers always supposed to be equal at the end of the loop?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are for valid binaries. digest_dynamic1() doesn't report errors, so instead I just use the max to size the plts[] array, then recheck in digest_dynamic2() and fail if all of the plt[] entries are not fully populated.

libexec/rtld-elf/rtld_c18n.c Outdated Show resolved Hide resolved
libexec/rtld-elf/rtld_c18n.c Outdated Show resolved Hide resolved
libexec/rtld-elf/rtld_c18n.c Outdated Show resolved Hide resolved
…relocs

This is a no-op for purecap but uses narrower effective bounds for
capability code pointers in hybrid.
Iterate over all the program headers in obj_remap_relro and remove the
relro fields from Obj_Entry.

Skip the call to obj_enforce_relro() in relocate_object() for the rtld
object as well as the main program object.  obj_enforce_relro() is
called later when it safe to reference globals such as page_size.

Reviewed by:	kib
Obtained from:	CheriBSD
Sponsored by:	AFRL, DARPA
Differential Revision:	https://reviews.freebsd.org/D47884

(cherry picked from commit fda0403eb0839b29b0b271c69c5cb6bfc874a3b5)
@bsdjhb
Copy link
Collaborator Author

bsdjhb commented Jan 16, 2025

So the CI failure is due to cheribsdtest-mt-c18n crashing, and it dies trying to lookup a callee compartment ID because the lookup function fails an assertion that the requested target address is in the bounds of the Obj_Entry:

(gdb) r
Starting program: /bin/cheribsdtest-mt-c18n -h
Assertion failed: (cheri_is_address_inbounds(obj->relocbase, addr)), function compart_id_for_address, file /usr/home/john/work/cheri/git/cheribsd/libexec/rtld-elf/rtld.c, line 6393.

Walking up the stack, it is wrapping a function pointer:

(gdb) up
#7  0x0000000040193ba0 in _rtld_thread_init (pli=<optimized out>)
    at /usr/home/john/work/cheri/git/cheribsd/libexec/rtld-elf/rtld_lock.c:447
447                     WRAP(tmplockinfo.at_fork,               true, 0, false, 

In this case, the function pointer is NULL:

(gdb) p tmplockinfo 
$4 = {rtli_version = 2, 
  lock_create = 0x40b3ed3d [rxRE,0x40b3ed10-0x40b3ee7c] (sentry), 
  lock_destroy = 0x40b3eead [rxRE,0x40b3ee80-0x40b3efe8] (sentry), 
  rlock_acquire = 0x40b3f01d [rxRE,0x40b3eff0-0x40b3f158] (sentry), 
  wlock_acquire = 0x40b3f18d [rxRE,0x40b3f160-0x40b3f2c8] (sentry), 
  lock_release = 0x40b3f2fd [rxRE,0x40b3f2d0-0x40b3f438] (sentry), 
  thread_set_flag = 0x40b3f46d [rxRE,0x40b3f440-0x40b3f5a8] (sentry), 
  thread_clr_flag = 0x40b3f5dd [rxRE,0x40b3f5b0-0x40b3f718] (sentry), 
  at_fork = 0x0, 
  dlerror_loc = 0x40393655 <_thr_dlerror_loc> [rxR,0x40368000-0x403d2000] (sentry), 
  dlerror_seen = 0x40393665 <_thr_dlerror_seen> [rxR,0x40368000-0x403d2000] (sentry), dlerror_loc_sz = 512}

So is it expected that we are installing a wrapper trampoline around a NULL function pointer? It seems nicer from a debugging experience if the caller faults immediately in the caller's context rather than faulting inside the trampoline when it branches to NULL?

@dpgao
Copy link
Contributor

dpgao commented Jan 16, 2025

So the CI failure is due to cheribsdtest-mt-c18n crashing, and it dies trying to lookup a callee compartment ID because the lookup function fails an assertion that the requested target address is in the bounds of the Obj_Entry:

(gdb) r
Starting program: /bin/cheribsdtest-mt-c18n -h
Assertion failed: (cheri_is_address_inbounds(obj->relocbase, addr)), function compart_id_for_address, file /usr/home/john/work/cheri/git/cheribsd/libexec/rtld-elf/rtld.c, line 6393.

Walking up the stack, it is wrapping a function pointer:

(gdb) up
#7  0x0000000040193ba0 in _rtld_thread_init (pli=<optimized out>)
    at /usr/home/john/work/cheri/git/cheribsd/libexec/rtld-elf/rtld_lock.c:447
447                     WRAP(tmplockinfo.at_fork,               true, 0, false, 

In this case, the function pointer is NULL:

(gdb) p tmplockinfo 
$4 = {rtli_version = 2, 
  lock_create = 0x40b3ed3d [rxRE,0x40b3ed10-0x40b3ee7c] (sentry), 
  lock_destroy = 0x40b3eead [rxRE,0x40b3ee80-0x40b3efe8] (sentry), 
  rlock_acquire = 0x40b3f01d [rxRE,0x40b3eff0-0x40b3f158] (sentry), 
  wlock_acquire = 0x40b3f18d [rxRE,0x40b3f160-0x40b3f2c8] (sentry), 
  lock_release = 0x40b3f2fd [rxRE,0x40b3f2d0-0x40b3f438] (sentry), 
  thread_set_flag = 0x40b3f46d [rxRE,0x40b3f440-0x40b3f5a8] (sentry), 
  thread_clr_flag = 0x40b3f5dd [rxRE,0x40b3f5b0-0x40b3f718] (sentry), 
  at_fork = 0x0, 
  dlerror_loc = 0x40393655 <_thr_dlerror_loc> [rxR,0x40368000-0x403d2000] (sentry), 
  dlerror_seen = 0x40393665 <_thr_dlerror_seen> [rxR,0x40368000-0x403d2000] (sentry), dlerror_loc_sz = 512}

So is it expected that we are installing a wrapper trampoline around a NULL function pointer? It seems nicer from a debugging experience if the caller faults immediately in the caller's context rather than faulting inside the trampoline when it branches to NULL?

Right. Having something like

if (data->target == NULL)
        return (false);

at the very beginning of tramp_should_include should fix the problem.

bsdjhb and others added 13 commits January 20, 2025 14:54
Add a new Plt_Entry structure that describes the data associated with
a PLT including the ELF relocation table and PLT GOT.  Count the
number of PLTs and allocate an array of Plt_Entry objects in the
Obj_Entry.  Instead of storing the Obj_Entry pointer in GOTs, store
the Plt_Entry pointer.

The special wrinkles for PowerPC are a bit hackish here and would need
some tweaks before upstreaming.
This was fixed in commit 6a998238523d3b4dac4bd81c61a998fa2fa4fa28 of
CHERI LLVM.
This will permit using fine-grained PCC capabilities in a future
commit.  The code for relocating a single capability follows a similar
flow to Morello's init_cap_from_fragment.

While here, use cheri_init_globals_impl directly for rtld's own
caprelocs on RISC-V.
If an object file includes one or more PT_CHERI_BOUNDS program headers,
derive new code capabilities for each header and derive code pointers
from the code capability that contains the address.

This does not yet check for overlapping bounds or if the resulting PCC
bounds are wider than the phdr due to precision.
Fail with an error message if the resulting PCC bounds do not match
the segment's relocated address and length.
…MPARTMENT

- Recognize the new c18n string table via DT_C18NSTRTAB* and save
  bounded pointers in each Obj_Entry.

- Define a new Compart_Entry type to hold information about a
  sub-object compartment including its compartment ID, virtual address
  bounds, and name.  The name for sub-object is "<obj
  name>:<compartment name>".

  Currently the default compartment for an object does not have a
  suffix.  Possibly it should.

- This requires reworking compartment assignment to be more explicitly
  timed (always after digest_dynamic) rather than a side effect of
  object_add_name.  Since we now always have DT_SONAME if it is
  present, save a pointer to DT_SONAME in Obj_Entry and prefer it for
  the library name for a compartment (instead of using the first name
  added).
…orcement

- Add a helper function to lookup the relevant compartment ID for a given
  virtual address and shared object.

- Save a compartment ID for each PLT (based on the address of the
  associated PLT GOT) and use this as the "subject" (caller) for
  policy enforcement when handling PLT GOT relocations.

- Use the target address of a function call to determine the "object"
  (callee).

Note: rtld currently does not enforce any policy for access to data
via the normal GOT.

Co-authored-by: Dapeng Gao <[email protected]>
@bsdjhb
Copy link
Collaborator Author

bsdjhb commented Jan 21, 2025

I think the remaining question is what to name PT_COMPARTMENT and if it should use a PT_C18N_ prefix?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants